add support for schema evolution #329
Conversation
Thanks for taking a stab at this. A few comments from a cursory review.
Note: I can keep working on this task unless you want to take charge.
I tried the following approach to handle type promotion:
The draft uses
(force-pushed from 10b955d to 4ade51d)
It's an interesting approach. Will look at it in more detail this weekend. First impression is that it is quite verbose, but I don't have a better idea at this point, so this is already great work.
Thanks for the quick feedback!
So I had some time to take a look at the last commits. First off, very nice progress here. It is not as verbose as I had previously thought. One thing I have noticed is that everything happens at runtime, which will carry a heavy performance penalty. However, we have all the information needed to move this from a runtime problem to a planning problem. My suggestion is to change the codecs to use a specific "conversion" function, for example:
This both cleans up the functions a little, and means we know up front what this function looks like. It will still take a performance hit from calling the indirect function, but I suspect this should be quite small; it will need to be benchmarked. For string <-> byte conversion, there are unsafe ways to make this alloc-less. The converter creators could look something like:
Leaving the planning to look something like this:
Thoughts?
Your solution looks cleaner and doesn't use

Regarding the

Another point (a bit opinionated) I was thinking about is that
Fair point. Either the nil value can be handled, or the empty
I tend to disagree. The
My point is specific to
Ah, I misunderstood. You are correct here. Generic reading should be at the codec level. It would be more consistent.
I’ll update the draft first to consider your proposed approach, then figure out how to handle the
(force-pushed from 1877e31 to e89e8c8)
(force-pushed from e89e8c8 to 845d478)
I integrated the "convert function" idea in my draft and ended up moving away from moving the readNext logic to the codec level, to keep the focus on schema evolution. If the type promotion handling design looks good, then the remaining central point to fix is handling the default value at the decoder level.

I made some cleanups and slightly reduced verbosity by considering that the

Any thoughts or suggestions for improvement, or a better solution, are welcome!
To avoid duplicating the type promotion and default value logic in the case of the dynamic decoder, I tried to replace:

```go
pObj := (*any)(ptr)
obj := *pObj
*pObj = r.ReadNext(d.schema)
```

With:

```go
pObj := (*any)(ptr)
obj := *pObj
receiverPtr, receiverTyp, err := dynamicReceiver(d.schema, r.cfg.resolver)
if err != nil {
	r.ReportError("Read", err.Error())
	return
}
decoderOfType(r.cfg, d.schema, receiverTyp).Decode(receiverPtr, r)
*pObj = receiverTyp.UnsafeIndirect(receiverPtr)
```

Benchmark:
(force-pushed from b8aac9b to 84bdd49)
(force-pushed from c17c6aa to 7db00da)
The draft looks good to me and can be turned into a PR. One point I haven't received your feedback on is the replacement of
I see the need here, because of the compatibility changes. I would prefer it in a separate PR though, just to reduce the change scope a little, and make it simpler to review.
First off, great work 🎉 Sorry for the delay, I have been away.
I am doing this review in stages as I have time, as it is quite large.
This is the first round.
Take your time to review it, and be merciless 🙂. It’s a large PR where errors could slip through.
Co-authored-by: Nicholas Wiersma <[email protected]>
LGTM 🎉
I misused

That's because the generic decoding bypasses the cache and instantiates a new decoder in each call:

I'm not sure if I should do something in this regard.
I am not super stressed. There is a lot in the lib that is difficult if not impossible to test. |
The main idea is to generate a composite schema by combining both read and write schemas. The composite schema essentially contains an optional field-level attribute `field.action`, which can take on one of two values: `DRAIN` or `SET_DEFAULT`. `field.action` is considered during the record decoding process, where it is used to skip reading the encoded value or to force setting the default value.

To set the default value, I added a `createDefaultDecoder` function that deals with different schema types and ensures the conversion of the default value.

Also, I made a change to the `SchemaCompatibility.Compatible` function to consider field aliases.

fixes #20